docs(plans)+test: rebaseline #497 OCR plans to #498 + gating probes (5-specialist framing) #500
…5-specialist framing)

Five specialists (cascade / family-codec / palette / dto-soa / truth-architect) framed the merged #497 OCR-transcode plans against the post-#498 substrate. Two showstoppers + 6-way drift; all 7 plans corrected:
- HelixResidue 48 B → 6 B everywhere (a stored Signed360 index, not a 48-byte field); budgets/carve rebaselined (Full 112, [32,144)); headers #496 → #498.
- "Morton-tile stacked-pyramid perturbation-shader" purged (does not exist; Morton rejected for Hilbert) → real primitives (mipmap pyramid / HHTL depth-cascade / CAKES).
- "reversible without a hash" reframed: no residue→rank inverse exists; node = identity → content-store lookup, codebook = repair signal (I-VSA-IDENTITIES).
- §0 tripwires: no ValueSchema::Ocr variant (ride Full/Compressed); Meta de-overloaded (confidence→Energy, provenance→Plasticity, OOV→content-store); TurbovecResidue is the edge codec, glyph→word uses DeepNSM CamCodes.
- master critical path 42→53 becomes 42→{50,51}→53 (resolves the open #497 CodeRabbit Major).

New ocr-probes-v1.md specs the 4 gating probes (OCR-RT/DET/POST/SCHEMA) for the unmeasured claims (int8-exact LSTM, bit-reproducible diff, 200k-LOC 1:1 layout). OCR-SCHEMA shipped as a contract test proving OCR rides an existing preset. EPIPHANIES E-OCR-PLAN-DRIFT-1 + AGENT_LOG entry. contract lib green; fmt clean.

https://claude.ai/code/session_01D2WSmezQBNC3bUdHuGfGmo
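The rebaselined budget figures can be checked with a few lines of arithmetic. This is a minimal sketch, not code from the repository: the function names `rebaseline` and `carve` are hypothetical, the earlier figures (Full 154, carve [32,186)) are the ones quoted in the PR description, and it assumes the whole change in the Full budget comes from shrinking HelixResidue from 48 B to 6 B, which the numbers are consistent with but the source does not state outright.

```rust
use std::ops::Range;

/// Recompute a row budget after one field changes size.
/// Hypothetical helper; the real plans state the results, not this formula.
fn rebaseline(old_budget: usize, old_field: usize, new_field: usize) -> usize {
    old_budget - old_field + new_field
}

/// Half-open byte range occupied by a budget that starts at `start`.
fn carve(start: usize, budget: usize) -> Range<usize> {
    start..start + budget
}

/// Values quoted in the PR text (assumed to be byte counts).
const CARVE_START: usize = 32;
const FULL_BUDGET_OLD: usize = 154;
const HELIX_RESIDUE_OLD: usize = 48;
const HELIX_RESIDUE_NEW: usize = 6;

/// Returns (new Full budget, old carve, new carve) for inspection.
fn full_rebaseline_summary() -> (usize, Range<usize>, Range<usize>) {
    let full_new = rebaseline(FULL_BUDGET_OLD, HELIX_RESIDUE_OLD, HELIX_RESIDUE_NEW);
    (
        full_new,
        carve(CARVE_START, FULL_BUDGET_OLD),
        carve(CARVE_START, full_new),
    )
}
```

Under that assumption the 42-byte saving accounts for both changes: Full goes from 154 to 112 and the carve end moves from 186 to 144.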
📝 Walkthrough
Rebaselines seven OCR transcode plan documents from post-#497 to post-#498 architecture, removing two showstoppers (false reversibility rationale and nonexistent Morton-tile cascade). Adds …
Changes
OCR Transcode Rebaseline and OCR-SCHEMA Shipment
Estimated code review effort: 2 (Simple), ~10 minutes
Pre-merge checks: 5 passed
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 24d3fd843a
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In @.claude/plans/ocr-probes-v1.md:
- Around line 30-33: The 99% pass threshold in the OCR-RT probe definition
allows lossy residue→rank mappings to incorrectly pass the reversibility gate.
Replace the threshold-based pass criterion (≥ 99%) with an exact requirement
that fails on any miss, ensuring only perfect round-trips satisfy this gate. If
tolerance is desired, separate it into a distinct quality probe rather than
mixing it into the reversibility assertion, so the corrected plans' claim about
text-as-identity and codebook-as-repair-signal can be properly validated.
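The distinction the comment draws, an exact reversibility gate versus a tolerant quality probe, can be sketched as below. This is an illustrative sketch only: `RoundTrip`, `round_trip`, `reversibility_gate`, and `quality_gate` are hypothetical names, the encode/decode closures stand in for whatever mapping the plans define, and treating an empty corpus as a failure is a choice made here, not something the source specifies.

```rust
/// Tally of a round-trip probe over a corpus.
#[derive(Debug, PartialEq)]
struct RoundTrip {
    total: usize,
    misses: usize,
}

/// Encode then decode every input and count the ones that do not come back unchanged.
fn round_trip<T, R>(
    inputs: &[T],
    encode: impl Fn(&T) -> R,
    decode: impl Fn(&R) -> Option<T>,
) -> RoundTrip
where
    T: PartialEq,
{
    let mut misses = 0;
    for x in inputs {
        let back = decode(&encode(x));
        if back.as_ref() != Some(x) {
            misses += 1;
        }
    }
    RoundTrip { total: inputs.len(), misses }
}

/// Reversibility: any single miss fails the gate.
fn reversibility_gate(r: &RoundTrip) -> bool {
    r.total > 0 && r.misses == 0
}

/// Quality: a separate probe where a tolerance is allowed.
fn quality_gate(r: &RoundTrip, min_hit_rate: f64) -> bool {
    r.total > 0 && (r.total - r.misses) as f64 / r.total as f64 >= min_hit_rate
}

/// A deliberately lossy codec: halving drops the low bit, so odd inputs do not round-trip.
fn example_lossy_codec() -> RoundTrip {
    let mut inputs: Vec<u32> = (0..100).map(|i| i * 2).collect();
    inputs.push(3); // the one input that cannot round-trip
    round_trip(&inputs, |x| x / 2, |r| Some(r * 2))
}
```

In the example, 100 of 101 inputs survive, so a 99% threshold would accept the lossy codec while the exact gate rejects it, which is the failure mode the comment describes.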
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: 2324b59e-e828-4fee-8cd8-4c7e02a16568
📒 Files selected for processing (8)
- .claude/board/AGENT_LOG.md
- .claude/board/EPIPHANIES.md
- .claude/plans/ocr-canonical-soa-integration-v1.md
- .claude/plans/ocr-probes-v1.md
- .claude/plans/soa-centroid-attention-field-synthesis-v1.md
- .claude/plans/tesseract-rs-ast-dll-codegen-v1.md
- .claude/plans/tesseract-rs-transcode-master-v1.md
- crates/lance-graph-contract/src/ocr.rs
…gate is exact

Two review threads on the merged #500:
- codex P2: "post-POC OCR rides Compressed" was wrong: Compressed lacks Energy+Plasticity, so the schema-gated transcode would silently drop confidence (→Energy) and repair-provenance (→Plasticity). Corrected: OCR rides Full (the only preset with the codec residues AND the hot lifecycle columns). The OCR-SCHEMA contract test now asserts Compressed lacks Energy/Plasticity (regression guard).
- CodeRabbit Major: OCR-RT reversibility gate tightened 99% → 100% exact (a lossy residue→rank map is NOT "reversible"; tolerance moved to a separate quality probe).

https://claude.ai/code/session_01D2WSmezQBNC3bUdHuGfGmo
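The shape of the regression guard described above might look like the following. This is a sketch under stated assumptions, not the contract crate's code: `Preset`, `Column`, `OCR_REQUIRED`, and `ocr_fits` are hypothetical stand-ins (the real type is referred to as ValueSchema and may have more presets and columns), and the column sets are inferred only from the statement that Compressed lacks Energy and Plasticity while Full carries the codec residues and both lifecycle columns.

```rust
/// Hypothetical column identifiers; only Energy and Plasticity are named in the PR text.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Column {
    CodecResidue,
    Energy,
    Plasticity,
}

/// Hypothetical stand-in for the two presets discussed in the review.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Preset {
    Full,
    Compressed,
}

impl Preset {
    /// Assumed column sets, inferred from the review text.
    fn columns(self) -> &'static [Column] {
        match self {
            Preset::Full => &[Column::CodecResidue, Column::Energy, Column::Plasticity],
            Preset::Compressed => &[Column::CodecResidue],
        }
    }

    fn has(self, column: Column) -> bool {
        self.columns().contains(&column)
    }
}

/// What OCR needs per the plans: codec residues, confidence (→Energy), provenance (→Plasticity).
const OCR_REQUIRED: [Column; 3] = [Column::CodecResidue, Column::Energy, Column::Plasticity];

/// A preset fits OCR only if every required column is present.
fn ocr_fits(preset: Preset) -> bool {
    OCR_REQUIRED.iter().all(|c| preset.has(*c))
}
```

Asserting the negative case explicitly (Compressed does not fit) is what turns the check into a regression guard: if a later change added the lifecycle columns to Compressed, or pointed OCR back at it, the test would flag the mismatch rather than let confidence and provenance be dropped silently.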
fix(plans)+test: #500 review — OCR rides Full not Compressed; OCR-RT gate exact
Follow-up to the merged #497 (OCR-transcode plans) + #498 (helix Signed360 + GUID keystone). Five specialists (cascade-architect / family-codec-smith / palette-engineer / dto-soa-savant / truth-architect) framed the merged #497 plans against the post-#498 substrate; this PR applies the corrections + specs the gating probes. Consolidated framing: EPIPHANIES.md E-OCR-PLAN-DRIFT-1.

Two showstoppers the framing caught
- … no residue→rank inverse exists; deepnsm/vocabulary.rs is a stored string-table keyed by rank, and every decode takes a known rank as input. Reframed: node = identity → content-store lookup; codebook = repair signal (I-VSA-IDENTITIES), not reversible text.
- … framebuffer::build_mipmap_pyramid/splat3d/depth_cascade/ CAKES ladder).

Plan corrections (all 7 #497 docs)
- … Signed360 place index, not a 48-byte field); budgets/carve rebaselined (Full 154→112, [32,186)→[32,144)); headers post-#496 → post-#498.
- … ValueSchema::Ocr (ride Full/Compressed or mint a class); de-overloaded Meta (confidence→Energy, provenance→Plasticity, OOV→content-store); flagged TurbovecResidue as the edge codec (rank-only); glyph→word uses DeepNSM CamCodes.
- … LayoutBlock::to_node_row landed in "feat(contract): GUID decode→read-mode keystone + helix Signed360 right-size + OCR→NodeRow transcode" #498); re-cast as "extend", not "build".
- … classid-marked as layout-addressed (forgoing the similarity-basin reading).
- … 42 → 53 becomes 42 → {50,51} → 53; resolves the open CodeRabbit Major on "docs(plan): Tesseract → tesseract-rs 1:1 transcode (LSTM hosted via embedanything) — v2" #497.
- … floor_version.

Probes (gate the unmeasured claims)
- New ocr-probes-v1.md specs 4 gating probes: OCR-RT (residue→rank round-trip), OCR-DET (repair determinism), OCR-POST (GGUF posterior parity), OCR-SCHEMA (ValueSchema fit), plus 3 cascade perf probes. The big claims (int8-exact LSTM, bit-reproducible diff, ~200k-LOC 1:1 layout) are CONJECTURE until these run.
- … ocr::tests::ocr_schema_fit_rides_existing_preset_no_new_variant); proves OCR rides Compressed/Full, no new enum variant.

Board: EPIPHANIES.md E-OCR-PLAN-DRIFT-1, AGENT_LOG.md entry. contract lib green; clippy/fmt clean.

https://claude.ai/code/session_01D2WSmezQBNC3bUdHuGfGmo
Generated by Claude Code